Functions
The sections below describe the SQLite functions you can use to build an expression with the Expression builder. The Expression builder divides the functions into the following categories:
- Strings
- LOBs
- Numeric
- NULL check
- Date and Time
- Data Enrichment
- Operation
- Other Functions
- Hash
- User-defined transformations
Strings
The following table describes the string functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
lower(x) |
The lower(x) function returns a copy of string x with all characters converted to lower case. The default built-in lower() function works for ASCII characters only. |
ltrim(x,y) |
The ltrim(x,y) function returns a string formed by removing all characters that appear in y from the left side of x. If there is no value for y, ltrim(x) removes spaces from the left side of x. |
replace(x,y,z) |
The replace(x,y,z) function returns a string formed by substituting string z for every occurrence of string y in string x. |
rtrim(x,y) |
The rtrim(x,y) function returns a string formed by removing all characters that appear in y from the right side of x. If there is no value for y, rtrim(x) removes spaces from the right side of x. |
substr(x,y,z) |
The substr(x,y,z) function returns a substring of input string x that begins with the y-th character and which is z characters long. If z is omitted then substr(x,y) returns all characters through the end of the string x beginning with the y-th. The left-most character of x is number 1. If y is negative then the first character of the substring is found by counting from the right rather than the left. If z is negative then the abs(z) characters preceding the y-th character are returned. If x is a string then characters indices refer to actual UTF-8 characters. If x is a BLOB then the indices refer to bytes. |
trim(x,y) |
The trim(x,y) function returns a string formed by removing all characters that appear in y from both sides of x. If there is no value for y, trim(x) removes spaces from both sides of x. |
replaceChars(X,Y,Z) |
The replaceChars(X,Y,Z) function replaces any character in string X that also exists in string Y (characters to be replaced) with Z (replacement characters) in the same position. This is especially useful for removing non-valid characters from paths and file names.
So, for example, specifying replaceChars("abcde","abcd","123") would return 1231e. |
LOBs
The following table describes the LOB functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
hex(x) |
The hex() function receives an argument as a BLOB and returns an upper-case hexadecimal string version of the BLOB content. |
randomblob(N) |
The randomblob(N) function returns an N-byte BLOB that contains pseudo-random bytes. If N is less than 1 then a 1-byte random BLOB is returned. |
zeroblob(N) |
The zeroblob(N) function returns a BLOB that consists of N bytes of 0x00. |
Numeric
The following table describes the numeric functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
abs(x) |
The abs(x) function returns the absolute value of the numeric argument X. Abs(x) returns NULL if x is NULL. Abs(x) returns 0.0 if x is a string or BLOB that cannot be converted to a numeric value. |
random() |
The random() function returns a pseudo-random integer between -9223372036854775808 and +9223372036854775807. |
round(x,y) |
The round(x,y) function returns a floating-point value x rounded to y digits to the right of the decimal point. If there is no value for y, it is assumed to be 0. |
max(x,y...) |
The multi-argument max() function returns the argument with the maximum value, or returns NULL if any argument is NULL. The multi-argument max() function searches its arguments from left to right for an argument that defines a collating function and uses that collating function for all string comparisons. If none of the arguments to max() define a collating function, then the BINARY collating function is used. Note that max() is a simple function when it has two or more arguments but operates as an aggregate function if it has a single argument. |
min(x,y...) |
The multi-argument min() function returns the argument with the minimum value. The multi-argument min() function searches its arguments from left to right for an argument that defines a collating function and uses that collating function for all string comparisons. If none of the arguments to min() define a collating function, then the BINARY collating function is used. Note that min() is a simple function when it has two or more arguments but operates as an aggregate function if it has a single argument |
NULL check
The following table describes the NULL check functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
coalesce(x,y...) |
The coalesce() function returns a copy of its first non-NULL argument, it returns NULL if all arguments are NULL. Coalesce() have at least two arguments. |
ifnull(x,y) |
The ifnull() function returns a copy of its first non-NULL argument, it returns NULL if both arguments are NULL. Ifnull() must have exactly two arguments. The ifnull() function is the same as coalesce() with two arguments. |
nullif(x,y) |
The nullif(x,y) function returns a copy of its first argument if the arguments are different and returns NULL if the arguments are the same. The nullif(x,y) function searches its arguments from left to right for an argument that defines a collating function and uses that collating function for all string comparisons. If neither argument to nullif() defines a collating function then the BINARY is used. |
Date and Time
The following table describes the Date and Time functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
date(timestring, modifier, modifier,...) |
Returns the date in the format YYYY-MM-DD. |
time(timestring, modifier, modifier,...) |
Returns the time in the format HH:MM:SS. |
datetime(timestring, modifier, modifier,...) |
Returns the date and time in the format YYYY-MM-DD HH:MM:SS. |
julianday(timestring, modifier, modifier,...) |
The julianday() function returns the number of days since noon in Greenwich on November 24, 4714 B.C. |
unixepoch(time-value, modifier, modifier,...) | The unixepoch() function returns a unix timestamp - the number of seconds since 1970-01-01 00:00:00 UTC. The unixepoch() always returns an integer, even if the input time-value has millisecond precision. |
strftime(format, timestring, modifier, modifier...) |
The strftime() routine returns the date formatted according to the format string specified as the first argument. It supports the following variables: %d: day of month %H: hour 00-24 %f: ** fractional seconds SS.SSS %j: day of year 001-366 %J: ** Julian day number %m: month 01-12 %M: minute 00-59 %s: seconds since 1970-01-01 %S: seconds 00-59 %w: day of week 0-6 sunday==0 %W: week of year 00-53 %Y: year 0000-9999 %%: % |
Examples
The expression builder provides you with a variety of options to build your own expression. You can use the regular operators with a date as well:
$HIRE_DATE < '2022-02-28'
Note that the right operand of the operator is contained between single quotes to prevent it being treated as a mathematical expression ( 2022 minus 2 minus 28).
Advanced examples using SQLite functions, modifiers and other operators
$HIRE_DATE < date('2022-02-28')
$DATE_1 < '2022-03-01' OR $DATE_1 > '2022-08-01'
$DATE_2 >= date('2022-03-01') AND $DATE_2 <= date('2022-08-01')
$HIRE_DATE < date('now','localtime','-1 year')
$HIRE_DATE <= date('now','utc')
$HIRE_DATE <= date('2022-02-28','+1 months')
$DATE_1 <= date($DATE_2,'+1 months')
List of valid modifiers in SQLite
The 'now' time string is the current date/datetime. You can specify an explicit date '2022-02-28' or you can use the date from a table column.
You can also apply multiple modifiers such as datetime('now', '-3 hours','+1 months'), for example.
Modifier | Example | Result |
---|---|---|
datetime('now') | 2020-04-26 00:53:53 | |
NNN days | date('now', '+3 days') | 2020-04-29 |
NNN hours | datetime('now', '-3 hours') | 2020-04-26 03:53:53 |
NNN minutes | datetime('now', '+3 minutes') | 2020-04-26 00:56:53 |
NNN.NNNN seconds | datetime('now', '-30 seconds') | 2020-04-26 00:54:23 |
NNN months | date('now', '+3 months') | 2020-07-26 |
NNN years | date('now', '-3 years') | 2017-04-26 |
start of month | date('now', 'start of month') | 2020-04-01 |
start of year | date('now', 'start of year') | 2020-01-01 |
start of day | datetime('now', 'start of day') | 2020-04-26 00:00:00 |
weekday N | date('now', 'weekday 6') | 2020-05-02 |
unixepoch | datetime('1588965525', 'unixepoch') | 2020-05-08 19:18:45 |
localtime | datetime('now', 'localtime') | 2020-04-26 10:53:53 |
utc | datetime('now', 'utc') | 2020-04-25 08:53:53 |
Data Enrichment
Data Enrichment functions allow the selected source tables to be augmented with data from other records located in either the source or target endpoints. Practical applications of data enrichment functions include code lookup or master record lookup (e.g. social security number lookup to find a person’s name).
You can enrich the target tables with supplemental data retrieved from the source or target endpoint by defining a transformation on the table. For more information about defining transformations on a single table, see Defining transformations for a single table/view.
Limitations
Amazon Redshift is not supported.
Data Enrichment functions
The table below describes the source and target lookup functions, which can be used both for table transformations and for global transformations. For a description of the parameters available for these functions, see Input Parameters below.
Function |
Description |
---|---|
source_lookup(TTL,'SCHM','TBL','EXP','COND', COND_PARAMS) |
Use to retrieve additional data from the source endpoint. |
target_lookup(TTL,'SCHM','TBL','EXP','COND', COND_PARAMS) |
Use to retrieve additional data from the target endpoint. |
Input parameters
The possible input parameters for the lookup functions are described in the table below. For a usage example, see Data Enrichment example.
Function |
Description |
---|---|
TTL |
TTL (Time to Live) is the amount of time the 'COND' return value will be cached. Caching the 'COND' return value improves performance by reducing the frequency that Enterprise Manager needs to access the source/target endpoint. As there is no default, you must specify a TTL value, which can be one of the following: <SECONDS> - The time to cache the 'COND' return value in seconds. Specify a short caching time (e.g. 3) for data that is frequently updated or a long caching time for data that rarely changes. 'NO_CACHING'- Specify 'NO_CACHING' if you do not want to cache the 'COND' return value. This is recommended for data that is constantly updated (e.g. share prices). 'NO_EXPIRATION'- For data that is never updated (e.g. a street name), specify 'NO_EXPIRATION' to store the Functions return value permanently in the cache. |
'SCHM' |
The schema name. |
'TBL' |
The table on which to perform the lookup. |
'EXP' |
The expression to retrieve data from the lookup table. Note: The expression syntax must be native to the endpoint it accesses. The result should be a single column. Possible expressions include: col1, col1+5, max(col1). Note: Full LOB columns are not supported. For information on including Limited-size LOB columns in the replication, see the description of the Metadata tab. Columns (transformations and filters only), Headers, and Metadata (Global transformations only) can also be used in the expression and are evaluated before the lookup statement is performed against the endpoint. |
The condition for the lookup statement. Note: The condition syntax must be native to the endpoint it accesses. The COND is a single field referencing all required fields. Example if the lookup table is located in Oracle: 'Fieldname1=:1 and Fieldname2=:2 and Fieldname3 =:3' Example if the lookup table is located in Microsoft SQL Server: 'Fieldname1=? and Fieldname2=? and Fieldname3=?' Columns (transformations and filters only), Headers, and Metadata (Global transformations only) can also be used in the expression and are evaluated before the lookup statement is performed against the endpoint. |
|
COND_PARAMS |
Any parameters required by the COND parameter. The COND_PARAMS (condition parameters) is not a single field, but a list of fields. Syntax: $FIELDNAME1 , $FIELDNAME2 , $FIELDNAME3 Full example: source_lookup( 10000 , 'HR' , 'DEPARTMENTS' , 'DEPARTMENT_NAME’ , 'COMPANY_ID=? and DIVISION_ID=? and DEPT_ID=?' , $COMP_ID , $DIV_ID , $DEPT_ID ) |
To improve efficiency, the source/target lookup tables should be indexed for the specified lookup fields.
Data Enrichment example
In the following example, Mike needs to add the DEPARTMENT_NAME column to the HR.JOB_HISTORY table. The DEPARTMENT_NAME column is located in the HR.DEPARTMENTS table in the source endpoint.
This is how the HR.JOB_HISTORY table appears before the column is added:
This is how the HR.JOB_HISTORY table appears after the Full Load completes:
To add the DEPARTMENT_NAME column, Mike needs to:
- Create a new task and select the HR.JOB_HISTORY table for replication.
- Apply a “New Column” transformation to the HR.JOB_HISTORY table. For more information on defining transformations, see Defining transformations for a single table/view.
- Open the Expression Builder and choose Data Enrichment from the Functions tab. For more information on the Expression Builder, see Using the Expression Builder.
-
Select the source_lookup function and configure it as follows (using the native syntax of the source endpoint):
If the lookup table is located in Oracle:
source_lookup(10000,'HR','DEPARTMENTS','DEPARTMENT_NAME',
'DEPARTMENT_ID=:1',$DEPARTMENT_ID)
If the lookup table is located in Microsoft SQL Server:
source_lookup
(10000,'HR','DEPARTMENTS','[DEPARTMENT_NAME]',
'[DEPARTMENT]=?',$DEPARTMENT_ID)
Where:
- 10000 is the TTL parameter.
- HR is the schema name.
- DEPARTMENTS is the table name.
- DEPARTMENT_NAME is the expression.
- DEPARTMENT_ID=:1 (or ? on Microsoft SQL Server) is the condition.
- $DEPARTMENT_ID is the condition parameter.
- Run the task.
Operation
The following table describes the Operation functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
operation_indicator(value_on_delete, value_on_update, value_on_insert) |
When the operation_indicator function is invoked on its own or as part of an expression, records deleted from the source endpoint will not be deleted from the target endpoint. Instead, the corresponding target record will be flagged (with a user-provided value) to indicate that it was deleted from the source. The operation_indicator function also requires you to provide values to indicate records that were inserted or updated in the source endpoint. Note: The operation_indicator function is not supported on tables that do not have a Primary Key. Note: It is recommended to add a dedicated column for the flag values, for example, OPERATION. For an explanation of how to add a column, see Using the Transform tab. To specify the function values: Replace value_on_delete, value_on_insert and value_on_update with the values that you want to appear in the target endpoint. Values should be formatted according to the corresponding column type. Example when the column type is INT4: operation_indicator(’1’, ’0’, ’0’) Example when the column type is STRING: operation_indicator(’Deleted’, ’Updated’, ’Inserted’) |
Other Functions
The following table describes additional functions used by the Expression Builder in Enterprise Manager.
Function |
Description |
---|---|
length(x) |
For a string value x, the length(x) function returns the number of characters (not bytes) in x before to the first NULL character. If x is NULL then length(x) is NULL. If x is numeric then length(X) returns the length of a string representation of X. |
like(x,y,z) |
The like() function is used to implement the "Y LIKE X [ESCAPE Z]" expression. The ESCAPE (z) clause is optional. If there is a z clause, then the like() function is invoked with three arguments. Otherwise, it is invoked with two arguments. |
typeof(x) |
The typeof(x) function returns a string that indicates the datatype of the expression x: null, integer, real, text, or BLOB. |
Hash
The Hash function generates a hash value for an inputted column (using the SHA-256 algorithm) and then returns the hex value of the generated hash value.
To use the function in an expression, add the hash_sha256(x) function to the Build Expression pane and then replace the "x" with the desired source column name (from the Input Columns tab).
The function is especially useful for masking sensitive information. In the expression below, for example, the Hash function has been used to obfuscate employees' email addresses.